22 research outputs found

    Untangling Local and Global Deformations in Deep Convolutional Networks for Image Classification and Sliding Window Detection

    Get PDF
    Deep Convolutional Neural Networks (DCNNs) commonly use generic `max-pooling' (MP) layers to extract deformation-invariant features, but we argue in favor of a more refined treatment. First, we introduce epitomic convolution as a building block alternative to the common convolution-MP cascade of DCNNs; while having identical complexity to MP, Epitomic Convolution allows for parameter sharing across different filters, resulting in faster convergence and better generalization. Second, we introduce a Multiple Instance Learning approach to explicitly accommodate global translation and scaling when training a DCNN exclusively with class labels. For this we rely on a `patchwork' data structure that efficiently lays out all image scales and positions as candidates to a DCNN. Factoring global and local deformations allows a DCNN to `focus its resources' on the treatment of non-rigid deformations and yields a substantial classification accuracy improvement. Third, further pursuing this idea, we develop an efficient DCNN sliding window object detector that employs explicit search over position, scale, and aspect ratio. We provide competitive image classification and localization results on the ImageNet dataset and object detection results on the Pascal VOC 2007 benchmark.Comment: 13 pages, 7 figures, 5 tables. arXiv admin note: substantial text overlap with arXiv:1406.273

    Deformable Part Models with CNN Features

    Get PDF
    International audienceIn this work we report on progress in integrating deep convo-lutional features with Deformable Part Models (DPMs). We substitute the Histogram-of-Gradient features of DPMs with Convolutional Neural Network (CNN) features, obtained from the top-most, fifth, convolutional layer of Krizhevsky's network [8]. We demonstrate that we thereby obtain a substantial boost in performance (+14.5 mAP) when compared to the baseline HOG-based models. This only partially bridges the gap between DPMs and the currently top-performing R-CNN method of [4], suggesting that more radical changes to DPMs may be needed

    Interactions entre rang et parcimonie en estimation pénalisée, et détection d'objets structurés

    No full text
    This thesis is organized in two independent parts. The first part focused on convex matrix estimation problems, where both rank and sparsity are taken into account simultaneously. In the context of graphs with community structures, a common assumption is that the underlying adjacency matrices are block-diagonal in an appropriate basis. However, these types of graphs are usually far from complete, and their adjacency representations are thus also inherently sparse. This suggests that combining the sparse hypothesis and the low rank hypothesis may allow to more accurately model such objects. To this end, we propose and analyze a convex penalty to promote both low rank and high sparsity at the same time. Although the low rank hypothesis allows to reduce over-fitting by decreasing the modeling capacity of a matrix model, the opposite may be desirable when enough data is available. We study such an example in the context of localized multiple kernel learning, which extends multiple kernel learning by allowing each of the kernels to select different support vectors. In this framework, multiple kernel learning corresponds to a rank one estimator, while higher-rank estimators have been observed to increase generalization performance. We propose a novel family of large-margin methods for this problem that, unlike previous methods, are both convex and theoretically grounded. The second part of the thesis is about detection of objects or signals which exhibit combinatorial structures, and we present two such problems. First, we consider detection in the statistical hypothesis testing sense, in models where anomalous signals correspond to correlated values at different sensors. In most existing work, detection procedures are provided with a full sample of all the sensors. However, the experimenter may have the capacity to make targeted measurements in an on-line and adaptive manner, and we investigate such adaptive sensing procedures. Finally, we consider the task of identifying and localizing objects in images. This is an important problem in computer vision, where hand-crafted features are usually used. Following recent successes in learning ad-hoc representations for similar problems, we integrate the method of deformable part models with high-dimensional features from convolutional neural networks, and shows that this significantly decreases the error rates of existing part-based models.Cette thèse est organisée en deux parties indépendantes. La première partie s'intéresse à l'estimation convexe de matrice en prenant en compte à la fois la parcimonie et le rang. Dans le contexte de graphes avec une structure de communautés, on suppose souvent que la matrice d'adjacence sous-jacente est diagonale par blocs dans une base appropriée. Cependant, de tels graphes possèdent généralement une matrice d'adjacente qui est aussi parcimonieuse, ce qui suggère que combiner parcimonie et range puisse permettre de modéliser ce type d'objet de manière plus fine. Nous proposons et étudions ainsi une pénalité convexe pour promouvoir parcimonie et rang faible simultanément. Même si l'hypothèse de rang faible permet de diminuer le sur-apprentissage en diminuant la capacité d'un modèle matriciel, il peut être souhaitable lorsque suffisamment de données sont disponible de ne pas introduire une telle hypothèse. Nous étudions un exemple dans le contexte multiple kernel learning localisé, où nous proposons une famille de méthodes a vaste-marge convexes et accompagnées d'une analyse théorique. La deuxième partie de cette thèse s'intéresse à des problèmes de détection d'objets ou de signaux structurés. Dans un premier temps, nous considérons un problème de test statistique, pour des modèles où l'alternative correspond à des capteurs émettant des signaux corrélés. Contrairement à la littérature traditionnelle, nous considérons des procédures de test séquentielles, et nous établissons que de telles procédures permettent de détecter des corrélations significativement plus faible que les méthodes traditionnelles. Dans un second temps, nous considérons le problème de localiser des objets dans des images. En s'appuyant sur de récents résultats en apprentissage de représentation pour des problèmes similaires, nous intégrons des features de grande dimension issues de réseaux de neurones convolutionnels dans les modèles déformables traditionnellement utilisés pour ce type de problème. Nous démontrons expérimentalement que ce type d'approche permet de diminuer significativement le taux d'erreur de ces modèles

    Interactions between rank and sparsity in penalized estimation, and detection of structured objects

    No full text
    Cette thèse est organisée en deux parties indépendantes. La première partie s'intéresse à l'estimation convexe de matrice en prenant en compte à la fois la parcimonie et le rang. Dans le contexte de graphes avec une structure de communautés, on suppose souvent que la matrice d'adjacence sous-jacente est diagonale par blocs dans une base appropriée. Cependant, de tels graphes possèdent généralement une matrice d'adjacente qui est aussi parcimonieuse, ce qui suggère que combiner parcimonie et range puisse permettre de modéliser ce type d'objet de manière plus fine. Nous proposons et étudions ainsi une pénalité convexe pour promouvoir parcimonie et rang faible simultanément. Même si l'hypothèse de rang faible permet de diminuer le sur-apprentissage en diminuant la capacité d'un modèle matriciel, il peut être souhaitable lorsque suffisamment de données sont disponible de ne pas introduire une telle hypothèse. Nous étudions un exemple dans le contexte multiple kernel learning localisé, où nous proposons une famille de méthodes a vaste-marge convexes et accompagnées d'une analyse théorique. La deuxième partie de cette thèse s'intéresse à des problèmes de détection d'objets ou de signaux structurés. Dans un premier temps, nous considérons un problème de test statistique, pour des modèles où l'alternative correspond à des capteurs émettant des signaux corrélés. Contrairement à la littérature traditionnelle, nous considérons des procédures de test séquentielles, et nous établissons que de telles procédures permettent de détecter des corrélations significativement plus faible que les méthodes traditionnelles. Dans un second temps, nous considérons le problème de localiser des objets dans des images. En s'appuyant sur de récents résultats en apprentissage de représentation pour des problèmes similaires, nous intégrons des features de grande dimension issues de réseaux de neurones convolutionnels dans les modèles déformables traditionnellement utilisés pour ce type de problème. Nous démontrons expérimentalement que ce type d'approche permet de diminuer significativement le taux d'erreur de ces modèles.This thesis is organized in two independent parts. The first part focused on convex matrix estimation problems, where both rank and sparsity are taken into account simultaneously. In the context of graphs with community structures, a common assumption is that the underlying adjacency matrices are block-diagonal in an appropriate basis. However, these types of graphs are usually far from complete, and their adjacency representations are thus also inherently sparse. This suggests that combining the sparse hypothesis and the low rank hypothesis may allow to more accurately model such objects. To this end, we propose and analyze a convex penalty to promote both low rank and high sparsity at the same time. Although the low rank hypothesis allows to reduce over-fitting by decreasing the modeling capacity of a matrix model, the opposite may be desirable when enough data is available. We study such an example in the context of localized multiple kernel learning, which extends multiple kernel learning by allowing each of the kernels to select different support vectors. In this framework, multiple kernel learning corresponds to a rank one estimator, while higher-rank estimators have been observed to increase generalization performance. We propose a novel family of large-margin methods for this problem that, unlike previous methods, are both convex and theoretically grounded. The second part of the thesis is about detection of objects or signals which exhibit combinatorial structures, and we present two such problems. First, we consider detection in the statistical hypothesis testing sense, in models where anomalous signals correspond to correlated values at different sensors. In most existing work, detection procedures are provided with a full sample of all the sensors. However, the experimenter may have the capacity to make targeted measurements in an on-line and adaptive manner, and we investigate such adaptive sensing procedures. Finally, we consider the task of identifying and localizing objects in images. This is an important problem in computer vision, where hand-crafted features are usually used. Following recent successes in learning ad-hoc representations for similar problems, we integrate the method of deformable part models with high-dimensional features from convolutional neural networks, and shows that this significantly decreases the error rates of existing part-based models

    Detection of Correlations with Adaptive Sensing

    Get PDF
    The problem of detecting correlations from samples of a high-dimensional Gaussian vector has recently received a lot of attention. In most existing work, detection procedures are provided with a full sample. However, following common wisdom in experimental design, the experimenter may have the capacity to make targeted measurements in an on-line and adaptive manner. In this work, we investigate such adaptive sensing procedures for detecting positive correlations. It it shown that, using the same number of measurements, adaptive procedures are able to detect significantly weaker correlations than their non-adaptive counterparts. We also establish minimax lower bounds that show the limitations of any procedure. Index Terms sequential testing, adaptive sensing, sparse covariance matrices, sparse principal component analysis, high-dimensional detection I
    corecore